A dynamic programming algorithm for binning microbial community profiles

Authors

  • Quansong Ruan
  • Joshua A. Steele
  • Michael S. Schwalbach
  • Jed A. Fuhrman
  • Fengzhu Sun
Abstract

MOTIVATION: A number of community profiling approaches have been widely used to study microbial community composition and its variation in environmental ecology. Automated Ribosomal Intergenic Spacer Analysis (ARISA) is one such technique. ARISA has been used to study microbial communities at different times and places using the length heterogeneity of the 16S-23S rRNA intergenic spacer. Owing to sampling errors, random mutations during PCR amplification, and, probably most of all, variation in the readings from the equipment used to measure fragment sizes, the data read directly from the fragment analyzer should not be used for downstream statistical analysis. No optimal data preprocessing methods are available. A commonly used approach is to bin the measured lengths of the 16S-23S intergenic spacer. We have developed a binning method for ARISA data analysis, based on a dynamic programming algorithm, that minimizes the overall differences between replicates from the same sampling location and time.

RESULTS: In a test example from an ocean time series sampling program, data preprocessing identified several outliers which, upon re-examination, were found to result from systematic errors. Clustering analysis of the ARISA data from different times, based on the data binned by the dynamic programming algorithm, revealed important features of the biodiversity of the microbial communities.
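The abstract does not spell out the recurrence, so the following is only a minimal sketch of the general idea and not the authors' published algorithm. It assumes two replicate profiles measured at the same sorted fragment lengths, a hypothetical `max_span` limit on bin width in base pairs, and a simplified cost (the absolute difference in summed abundance per bin). The function name `dp_bin` and all parameters are illustrative.

```python
from itertools import accumulate


def dp_bin(lengths, rep_a, rep_b, max_span=1):
    """Partition sorted fragment lengths into contiguous bins.

    Assumed objective (a simplification, not the paper's exact one):
    minimize the sum over bins of |abundance_a(bin) - abundance_b(bin)|,
    where each bin spans at most `max_span` base pairs.
    Returns (total_cost, list_of_bins).
    """
    n = len(lengths)
    # Prefix sums give the abundance of any contiguous bin in O(1).
    pa = [0.0] + list(accumulate(rep_a))
    pb = [0.0] + list(accumulate(rep_b))

    best = [float("inf")] * (n + 1)  # best[j]: minimal cost for the first j lengths
    best[0] = 0.0
    prev = [0] * (n + 1)             # prev[j]: start index of the last bin

    for j in range(1, n + 1):
        # Try every feasible start i for a bin covering lengths[i:j].
        for i in range(j - 1, -1, -1):
            if lengths[j - 1] - lengths[i] > max_span:
                break  # earlier starts would only make the bin wider
            cost = best[i] + abs((pa[j] - pa[i]) - (pb[j] - pb[i]))
            if cost < best[j]:
                best[j] = cost
                prev[j] = i

    # Walk the back-pointers to recover the bins.
    bins = []
    j = n
    while j > 0:
        i = prev[j]
        bins.append(lengths[i:j])
        j = i
    bins.reverse()
    return best[n], bins


# Toy replicates whose peaks are shifted by 1 bp between runs.
lengths = [400, 401, 402, 500, 501]
rep_a = [0.5, 0.0, 0.0, 0.5, 0.0]
rep_b = [0.0, 0.5, 0.0, 0.0, 0.5]
cost, bins = dp_bin(lengths, rep_a, rep_b, max_span=1)
print(cost, bins)
```

In this toy case the 1 bp shift between replicates is absorbed by grouping 400 with 401 and 500 with 501, which is the kind of instrument-reading variation the abstract says binning is meant to handle. A real implementation would need the paper's actual objective and would handle more than two replicates.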

Similar resources

The Binning of Metagenomic Contigs for Microbial Physiology of Mixed Cultures

So far, microbial physiology has dedicated itself mainly to pure cultures. In nature, cross feeding and competition are important aspects of microbial physiology and these can only be addressed by studying complete communities such as enrichment cultures. Metagenomic sequencing is a powerful tool to characterize such mixed cultures. In the analysis of metagenomic data, well established algorith...

A Novel Abundance-Based Algorithm for Binning Metagenomic Sequences Using l-Tuples

Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. Among the computational tools recently developed for metagenomic sequence analysis, binning tools attempt to classify the sequences in a metagenomic dataset into different bins (i.e., species), based on various DNA composition patterns (e.g., the tetramer frequencies) of ...

Extracting Dynamics Matrix of Alignment Process for a Gimbaled Inertial Navigation System Using Heuristic Dynamic Programming Method

In this paper, with the aim of estimating the internal dynamics matrix of a gimbaled inertial navigation system (as a discrete linear system), the discrete-time Hamilton-Jacobi-Bellman (HJB) equation for optimal control has been extracted. A Heuristic Dynamic Programming (HDP) algorithm for solving the equation has been presented and then a neural network approximation for cost function and control input ...

A New Optimization via Invasive Weeds Algorithm for Dynamic Facility Layout Problem

The dynamic facility layout problem (DFLP) is the problem of finding positions of departments on the plant floor for multiple periods (material flows between departments change during the planning horizon) such that departments do not overlap, and the sum of the material handling and rearrangement costs is minimized. In this paper a new optimization algorithm inspired from colonizing weeds, Invasi...

An Efficient Algorithm for Reducing the Duality Gap in a Special Class of the Knapsack Problem

A special class of the knapsack problem is called the separable nonlinear knapsack problem. This problem has received considerable attention recently because of its numerous applications. Dynamic programming is one of the basic approaches for solving this problem. Unfortunately, the size of the state space increases dramatically and causes the dimensionality problem. In this paper, an efficient a...

Journal:
  • Bioinformatics

Volume 22, Issue 12

Pages: -

Publication date: 2006